The whole alignment and nothing but the alignment: the problem of spurious alignment flanks
Abstract
Pairwise sequence alignment is a ubiquitous tool for inferring the evolution and function of DNA, RNA and protein sequences. It is therefore essential to identify alignments arising by chance alone, i.e. spurious alignments. On the one hand, if an entire alignment is spurious, statistical techniques for identifying and eliminating it are well known. On the other hand, if only a part of the alignment is spurious, elimination is much more problematic. In practice, even the sizes and frequencies of spurious subalignments remain unknown. This article shows that some common scoring schemes tend to overextend alignments and generate spurious alignment flanks up to hundreds of base pairs or amino acids in length. For example, in the UCSC genome database, spurious flanks probably comprise >18% of the human-fugu genome alignment. To evaluate the possibility that chance alone generated a particular flank on a particular pairwise alignment, we provide a simple 'overalignment' P-value. The overalignment P-value can identify spurious alignment flanks, thereby eliminating potentially misleading inferences about evolution and function. Moreover, by explicitly demonstrating the tradeoff between over- and under-alignment, our methods guide the rational choice of scoring schemes for various alignment tasks.
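The overextension effect described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' method and does not compute their overalignment P-value. It assumes ungapped extension from the end of a true alignment into unrelated, uniformly random DNA (match probability 0.25), and it compares two hypothetical match/mismatch scoring schemes chosen only for contrast. All function names and parameter values are illustrative assumptions.

```python
import random


def spurious_flank_length(rng, match, mismatch, max_len=500, p_match=0.25):
    """Length of the best-scoring ungapped extension into unrelated sequence.

    Each extended column is a chance match with probability p_match.
    The extension that maximizes the cumulative score is the one an
    optimal ungapped aligner would keep, so its length is the size of
    the spurious flank in this toy model.
    """
    score = 0
    best_score = 0
    best_len = 0
    for i in range(1, max_len + 1):
        score += match if rng.random() < p_match else mismatch
        if score > best_score:  # ties keep the shorter extension
            best_score, best_len = score, i
    return best_len


def mean_flank_length(match, mismatch, trials=2000, seed=0):
    """Average spurious flank length over many simulated extensions."""
    rng = random.Random(seed)
    total = sum(spurious_flank_length(rng, match, mismatch) for _ in range(trials))
    return total / trials


# Hypothetical schemes: a harsh mismatch penalty versus a mild one.
strict = mean_flank_length(match=1, mismatch=-3)
lenient = mean_flank_length(match=2, mismatch=-1)
print(f"mean spurious flank, +1/-3 scheme: {strict:.2f} columns")
print(f"mean spurious flank, +2/-1 scheme: {lenient:.2f} columns")
```

In this simplified model the milder mismatch penalty lets chance matches carry the alignment further into unrelated sequence, while the harsher penalty stops sooner, which in a real alignment would come at the risk of truncating genuine homology. That is the over- versus under-alignment tradeoff the abstract refers to, shown here only qualitatively.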
Similar resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Effect of Objective Function on the Optimization of Highway Vertical Alignment by Means of Metaheuristic Algorithms
The main purpose of this work is the comparison of several objective functions for optimization of the vertical alignment. To this end, after formulation of optimum vertical alignment problem based on different constraints, the objective function was considered as four forms including: 1) the sum of the absolute value of variance between the vertical alignment and the existing ground; 2) the su...
IT - Business Strategic Alignment and Organizational Agility: The Moderating Role of Environmental Uncertainty
This study investigates the effect of IT-business strategic alignment on organizational agility by considering the effects of IT flexibility and IT capability on strategic alignment. Also this study investigates the moderating role of environmental uncertainty on the relationship between strategic alignment and organizational agility. This research is an applied research based on purpose and de...
An Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...
OPTIMIZATION OF VERTICAL ALIGNMENT OF HIGHWAYS IN TERMS OF EARTHWORK COST USING COLLIDING BODIES OPTIMIZATION ALGORITHM
One of the most important factors that affects construction costs of highways is the earthwork cost. On the other hand, the earthwork cost strongly depends on the design of vertical alignment or project line. In this study, at first, the problem of vertical alignment optimization was formulated. To this end, station, elevation and vertical curve length in case of each point of vertical intersec...